Art Unit: 2623

(1) Real Party in Interest 

A statement identifying by name the real party in interest is contained in the brief.

This is in response to the appeal brief filed 01/20/06 appealing from the Final Office
action mailed August 24, 2005.

(2) Related Appeals and Interferences 

The examiner is not aware of any related appeals, interferences, or judicial proceedings 
which will directly affect or be directly affected by or have a bearing on the Board's decision in 
the pending appeal. 

(3) Status of Claims 

The statement of the status of claims contained in the brief is correct. 

(4) Status of Amendments After Final 

The appellant's statement of the status of amendments after final rejection contained in 
the brief is correct. 

(5) Summary of Claimed Subject Matter 

The summary of claimed subject matter contained in the brief is correct. 

(6) Grounds of Rejection to be Reviewed on Appeal 

The appellant's statement of the grounds of rejection to be reviewed on appeal is correct. 

(7) Claims Appendix 

The copy of the appealed claims contained in the Appendix to the brief is correct. 

(8) Evidence Relied Upon 

6,546,555 B1 HJELSVOLD ET AL. 04-2003

6,463,444 B1 JAIN ET AL. 10-2002
(9) Grounds of rejection: The Detailed Action of the Final Office action is included herein
so that the members of the Board of Appeals can conveniently review and examine it without
having to refer back and forth between this examiner's answer and the Final Office action.

DETAILED ACTION 
Claim Rejections - 35 USC 102 
The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that form the 
basis for the rejections under this section made in this Office action: 
A person shall be entitled to a patent unless -

(e) the invention was described in (1) an application for patent, published under section 
122(b), by another filed in the United States before the invention by the applicant for 
patent or (2) a patent granted on an application for patent by another filed in the United 
States before the invention by the applicant for patent, except that an international 
application filed under the treaty defined in section 351(a) shall have the effects for 
purposes of this subsection of an application filed in the United States only if the 
international application designated the United States and was published under Article 
21(2) of such treaty in the English language. 

Claims 1-10 and 18-25 are rejected under 35 U.S.C. 102(e) as being anticipated by
Hjelsvold et al. (U.S. Patent No. 6,546,555 B1).

Regarding claim 1, Hjelsvold discloses "a method for processing video, the method 
comprising: determining an association between a first video segment including a particular 
feature and at least one additional information source also including that feature; and utilizing the 
association to display information from the additional information source based at least in part on 
a selection by a user of the feature in the first video segment while the video segment is 
displayed to the user", i.e., video segments are delivered to the viewer and, while viewing the
programming segment with a particular feature, the viewer can further access information related
to that feature from an additional information source of a vendor for that particular product or
service, based on the defined association between the video segment and related information
sequences (Figs. 16-18, and col. 2/line 58 to col. 3/line 32 for hypervideo to link to additional
information related to a feature of a product or a service; and col. 11/line 16 to col. 12/line 10 for
further details on the determination of association between parameter values and hyperlink and
hypervideo).

As for claim 2, in view of claim 1, Hjelsvold discloses "wherein determining the 
association further includes the step of retrieving the association from a memory" (Fig. 1, for the 
server 10 retrieves the meta-data from a database for filtering based on related features as 
explained earlier). 

As for claims 3 and 4, in view of claim 1, Hjelsvold discloses "wherein determining the 
association further includes determining the association from information in a portion of the
video segment", i.e., a portion of a video segment as individual shots, scenes and sequences can 
be determined, requested and retrieved (Figs. 4-5, and col. 6/line 65 to col. 7/line 27); and 
Hjelsvold further discloses "wherein the additional information source comprises an additional 
video segment also including the feature" (as already discussed in claim 1). 

As for claims 5 and 6, in view of claim 4, Hjelsvold discloses "wherein utilizing the 
association includes switching from display of the first video segment to display of the additional 
video segment also including the feature" (as shown in Figs. 16-18, for the display of the next 
screen including the video feature of the feature of a product or service); and Hjelsvold further 
discloses "wherein utilizing the association includes displaying the additional video segment at 
least in part in a separate portion of a display which also includes at least a portion of the first
video segment" (Figs. 17-18, and col. 12/lines 1-33). 

As for claim 7, in view of claim 1, Hjelsvold further discloses "wherein the feature is a 
video feature extracted from at least one frame of the video segment", i.e., the selected target is 
at least one frame of the video segment as a window frame of image (Fig. 17 and col. 12/lines
1-33 clearly show that the next additional information is extracted from at least one frame of the video
sequences, as discussed earlier in Figs. 4-5 for the building of an association between parameter 
values for video sequences within the filtering process). 

As for claim 8, in view of claim 7, Hjelsvold discloses "wherein the video feature 
comprises at least one of a frame characterization, a face identification, a scene identification, an
event identification, and an object identification" (Figs. 14 and 16-18 for these features).

As for claims 9 and 10, in view of claim 1, Hjelsvold further discloses "wherein the 
feature is an audio feature extracted from at least one frame of the video segment" and "wherein
utilizing the association includes combining an audio signal corresponding to the audio feature 
with an audio signal associated with the first video segment", i.e., as the user selects a desired 
target, the video segment including audio tracks related to the selected portion of feature is 
provided (col. 4/lines 51-64 for multimedia includes video, audio tracks and other objects). 

As for claims 18-25, these claims recite the same limitations and are rejected for the reasons
given for claims 1-10, as already discussed in detail above.
Claim Rejections - 35 USC 103

The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all
obviousness rejections set forth in this Office action:

(a) A patent may not be obtained though the invention is not identically disclosed or 
described as set forth in section 102 of this title, if the differences between the subject 
matter sought to be patented and the prior art are such that the subject matter as a whole 
would have been obvious at the time the invention was made to a person having ordinary 
skill in the art to which said subject matter pertains. Patentability shall not be negatived 
by the manner in which the invention was made. 

Claims 11-16 are rejected under 35 U.S.C. 103(a) as being unpatentable over Hjelsvold et
al. (U.S. Patent No. 6,546,555 B1) in view of Jain et al. (U.S. Patent No. 6,463,444 B1).

Regarding claim 11, in view of claim 9, Hjelsvold does not further disclose "wherein 
utilizing the association includes converting an audio signal corresponding to the audio feature 
into a textual format which is displayed with the first video segment"; however, such a technique 
of converting audio signal to a textual format or speech-to-text feature is known in the art. In 
fact, Jain, in a video cataloger system for providing video/audio information data to the user, 
teaches the use of a closed caption decoder (Fig. 3) or a speech-to-text conversion technique for
providing a textual format to display with the video to the user (Fig. 9, item 518, and col. 9/line
45 to col. 10/line 38 for audio feature extractors). Therefore, it would have been obvious to one
of ordinary skill in the art at the time the invention was made to modify Hjelsvold's system with
Jain's technique as disclosed in order to provide an additional feature, such as a textual
format, in addition to the display of the video presentation. This technique is helpful for
people who have difficulty hearing, because they can read the text on the display screen instead,
which also serves as a motivation for modifying Hjelsvold regarding this limitation.
As for claim 12, in view of claims 9 and 11 above, Jain further teaches "separating at
least a portion of the video segment into audio categories including one or more of single-voice
speech, multiple voice speech, music, silence and noise in order to extract the audio feature
therefrom" (see Fig. 6 and col. 9/line 45 to col. 10/line 38 for a monitoring screen for separating
a portion of a video segment into audio categories, and for the audio feature extractors addressed).

As for claim 13, in view of claims 9 and 11 above, Jain teaches "wherein the audio
feature comprises at least one of a music signature extraction, a speaker identification, and a
transcript extraction", i.e., music and/or speaker ID, signatures or sample speeches of an individual
speaker, or transcripts from the speaker are within the audio features addressed (see col. 9/line 18
to col. 10/line 38).

As for claim 14, in view of claim 1, the combination of Hjelsvold and Jain teaches
"wherein the feature is a textual feature extracted from at least one frame of the video segment",
i.e., when Jain's textual feature extraction technique is applied, the at least one frame of the video
segment of Hjelsvold, as discussed earlier, would contain the textual feature (see claims 1, 7 and
11).

As for claim 15, in view of claim 14, Jain further discloses "wherein utilizing the 
association includes displaying information corresponding to the textual information as an 
overlay on a display of the first video segment" (as illustrated in Fig. 17, and col. 14/lines
15-63).

As for claim 16, in view of claims 1 and 14, Jain further teaches "wherein determining 
the association further includes determining the association based at least in part on at least one 
multi-dimensional feature vector extracted from a portion of the video segment using a feature 
extraction technique" (Fig. 14, and col. 12/lines 20-46 for feature extraction technique
addressed). 

Claim 17 is rejected under 35 U.S.C. 103(a) as being unpatentable over Hjelsvold et al.
(U.S. Patent No. 6,546,555 B1) in view of the application's specification (page 9, line 19 to page
10, line 13).

As for claim 17, in view of claim 1, Hjelsvold does not disclose "wherein determining
the association further includes determining the association based at least in part on at least one
of a similarity measure and a clustering technique"; however, this limitation is admitted as prior
art by the Applicant (page 9, line 19 to page 10, line 13). Therefore, it would have been obvious
to one of ordinary skill in the art at the time the invention was made to modify Hjelsvold's
technique with the known prior art use of a similarity measure and a clustering technique for
determining the association or the relationship in the determining step of claim 1, for the purpose
of providing the same information to a group of users with similar interests in a certain product or
service, as preferred.
(10) Examiner's Arguments:

Claims 1-10 and 19-25: (a user, or a viewer, or a client can be used interchangeably 
herein by the examiner for referring to the user at the client side, as illustrated in Figs. 13-14 of 
Hjelsvold). 

1a) The appellants argue that Hjelsvold does not teach "determining an association
between a first video segment including a particular feature and at least one additional 
information source also including that feature". 

First, one would wonder what "an association" is that the appellants refer to. On pages 6-7
of the specification, it is simply referred to as "a corresponding physical link" between two
entities or features. It may imply other things, but on page 7, 2nd paragraph, it broadly calls for
the physical link itself. One of ordinary skill in the art understands that "a physical link" is a
connection between two entities, either wired or wireless, i.e., the use of twisted
wire pair, fiber optic cable, coaxial cable, radio frequency (RF), or an Internet connection
using the existing telephone wires for dialing up or a cable modem for a high speed link. Thus, it
is clear that Hjelsvold's system, Figs. 14-18, shows an Internet connection from the clients to
streaming servers, and that this connection represents "an association" by having a physical link,
and a (first) video segment(s) including a particular feature (a product), as shown in Figs. 16-17:
a video scene of a product (a particular feature) with its corresponding further product information
(additional information source), and this additional product information (more video information
about the product or price) is obtained from the vendor. Claim 1 in the action was also clear in
explaining the hyperlink for linking to hypervideo, which also reads on the claimed feature
because the Internet linking connection by using a hyperlink for further product information
defined "the association between two entities". Then the displaying information to the user 
occurs based on the use of the association as already discussed in the office action (refer to Fig. 
17), and there is no need to repeat it here. 

1b) The appellants argue that Hjelsvold does not teach the analyze feature.

Furthermore, the appellant mistakenly notes that Hjelsvold does not teach "analyze the
video segment to determine an association to provide such a link..." (page 10 of 22, 1st par.). The
examiner would like to point to the illustrations in Figs. 4-8 and column 11, lines 25-67, where
video segments are identified, defined, and analyzed based on types and parameters, and the filtering
process handles the task of matching the linking process to the appropriate additional information
features; otherwise, the user cannot get the appropriate links for an additional video scene or
additional product information related to the displayed product (refer to Fig. 17 again).

2) The appellants argue that Hjelsvold does not teach extracting a video feature 
from a frame (claims 7-8). 

Please take a closer look at column 6, line 50 to column 7, line 11 for the extracting
feature as individual shots, scenes, and narrative sequences of scenes are analyzed, indexed, and 
determined in order to generate the filtering meta-data needed for further processing and 
identifying purposes. 
3a) The appellants argue that Hjelsvold does not teach extracting an audio feature
from a frame (claim 9).

In addition to column 4, lines 51-64, cited in the office action, where audio tracks and closed
captioning are identified, Hjelsvold teaches that the audio information data is extracted in the form
of text (or a textual feature) for visual display in case the viewer cannot hear the audio feature; the
video segments include both video and audio information, referring not only to video but also to audio.

3b) The appellants argue that Hjelsvold does not teach mixing or merging the audio 
among the selected segments (claim 10). 

Understanding items 3 and 4 above, one would easily realize that as the user selects a
desired target (for a product), the video segment including audio tracks related to the selected
portion of the feature (of that product) with additional video information is provided (col. 4/lines
51-64 for multimedia including video, audio tracks and other objects). Further details appear at column
10, line 48 to column 11, line 15, where the synchronization between tracks performs the mixing
and/or merging of different tracks (of video including audio tracks) based on parameters for
displaying the particular hypervideo stream to the user.
5) The appellants argue that Hjelsvold and Jain fail to teach or suggest extracting a
textual feature from a frame of the video segment (claims 14-15).

The combination of Hjelsvold and Jain teaches extracting a textual feature from a frame of
the video segment, i.e., when Jain's textual feature extraction technique is applied, the at least one
frame of the video segment of Hjelsvold, as discussed earlier, would contain the textual feature.
Furthermore, please take a closer look at Hjelsvold, column 6, line 50 to column 7, line 11, for
the extracting feature, as individual shots, scenes, and narrative sequences of scenes are analyzed,
indexed, and determined in order to generate the filtering meta-data needed for further
processing and identifying purposes.

6) The appellants argue that Hjelsvold and Jain fail to teach or suggest determining 
an association based at least in part on at least one multi-dimensional feature vector
extracted from a portion of the video segment using a feature extraction technique (claim 
16). 

In the office action, Figure 14 and the cited paragraphs of Jain are relied upon for the flow chart of
how the feature extraction is done. In order to clarify the issue, the examiner would like to
further point to Figure 7 and column 7/lines 10-29, as Jain clearly shows the relationship (or
"an association") of video segment or portion of metadata as being indexed by the feature 
extractors. Those indexed or identified video IDs are associated together as sequences. 
Therefore, Jain reads on this feature. 
7) The appellants argue that Hjelsvold fails to teach determining an association
between a first video segment including a particular feature and at least one additional 
information source also including that feature (claim 17). 

As discussed earlier, one would wonder what is "an association" that the appellants refer 
to? In pages 6-7 of the specification, it is simply referred to as "a corresponding physical link" 
between two entities or features. It may imply other things, but on page 7, 2nd paragraph, it
broadly calls for the physical link itself. One of ordinary skill in the art understands that "a
physical link" is a connection between two entities, either in wired or wireless connection, i.e., 
the use of twisted wire pair, fiber optic cable, coaxial cable, radio frequency or RF, or the
Internet connection using the existing telephone wires for dialing up or a cable modem for high 
speed link. Thus, it is clear that Hjelsvold's system, Figs. 14-18, shows an internet connection 
from the clients to streaming servers, and that this connection represents "an association" for
having a physical link, and a (first) video segment(s) including a particular feature (a product), as 
shown in Figs. 16-17, a video scene of a product (a particular feature) with its corresponding 
more product information (additional information source), and this additional product 
information (more video information about the product or price) is obtained from the vendor. 
Claim 1 in the office action was also clear in explaining the hyperlink for linking to
hypervideo, which also reads on the claimed feature because the Internet linking connection by
using a hyperlink for further product information defined "the association between two entities".
Then the displaying information to the user occurs based on the use of the association as already 
discussed in the office action (refer to Fig. 17), and there is no need to repeat it here. 
5) The appellants argue that Hjelsvold does not teach controlling a display based on
a selection by a user of the feature in the first video segment while the video segment is 
displayed to the user (claim 18). 

Please refer to column 7, lines 47-64, where Hjelsvold teaches that the user can access and select
the feature presented to the user/viewer, and refer to item 5 above on the synchronization and
filtering process for displaying streaming video segments to the user/viewer/client.

7) Prior Art Admitted 

In addition, the limitation of claim 17 is admitted as prior art submitted by the Applicant 
(page 9, line 19 to page 10, line 13). 

For the above reasons, it is believed that the rejections should be sustained. 
Respectfully submitted, 